Main task

Here are some recent tropical storms that were very severe with severe impacts to humans:

Storm Landfall time Landfall location Some places affected
Hurricane Andrew 0905 UTC 24 August 1992 near Homestead AFB FL Miami, Key Largo, Nassau (Bahamas)
Labor Day Storm late on September 2, 1935 Florida Keys Florida Keys, St. Petersburg, Florida
Cyclone Tracy very early December 25, 1974 Darwin, Australia Darwin
Tropical Storm Bilis 12:50 p.m. local time July 14, 2006 near Fuzhou, China Guangdong, Hunan, Fujian

You will be trying to collect relevant data for one of these storms.

Relevant NOAA weather products

In later sections, you will have specific goals to try to achieve, in terms of what data to get. However, as you work on this project, please keep a running list here of any of the NOAA data products that you think might have data that could be used to explore the storm you’re working on. Include the name of the dataset, a short description of what you might be able to get from it, and a link to any websites that give more information.

BA: Note: For many of the rnoaa functions, you will need an API key from NOAA. Here’s how you should save that on your computer so that you can get this document to compile without including your API key (these instructions are adapted from here:

  1. Ask NOAA for an API key: http://www.ncdc.noaa.gov/cdo-web/token
  2. Open a new text file. Put the following in it (note: this might not show up here, but there should be a second blank line):
noaakey == [your NOAA API key]
  1. Save this text file as .Renviron in your home directory. Your computer might be upset that the file name starts with a dot. That’s okay– it should; ignore the warning and save like this.
  2. Restart R.
  3. Now you can pull your key using Sys.getenv("noaakey")

Now you can gear up to use functions that require the API key:

options("noaakey" = Sys.getenv("noaakey"))

Relevant other data sources

As you work on this project, also keep a running list here of any data from sources other than NOAA that you think might have data that could be used to explore the storm you’re working on. Include the name of the dataset, a short description of what you might be able to get from it, and a link to any websites that give more information and / or the data itself.

Specific tasks

As you work on these, include the code that you try. Include things that worked and things that you thought would work but that did not. Also write down what parts were easy to figure out how to do, and which you had to search around a lot to figure out what to do.

Winds at landfall

Get any data you can that gives measurements of winds when the storm made landfall (use the landfall listed for the storm in the table– often, there are more than one for a storm for different locations, but just try for the listed one).

Try to get:

  • A measure of how strong winds were over land around where the storm made landfall
  • A measure of how strong winds were over water around where the storm made landfall
  • Wind directions at different locations (on land or over water) near landfall
  • An estimate of how many of the stations near the landfall that were operating the week before the storm that were still operational and recording data at landfall

A measure of how strong winds were over land around where the storm made landfall

BA: It may work, for daily wind speeds, to use the Global Historical Climatology Network Daily data. If you have monitor ids, you can use the meteo_pull_monitors function from rnoaa to pull data from this source. If you have the FIPS code for a county, you can use the function fips_stations in countyweather (a package Rachel is developing) to get the station IDs for all relevant stations in a county.

CP: I had issues running install_github("ropenscilabs/rnoaa") in order to use the function meteo_pull_monitors, so instead try load_all(). Note that this will only work if you have devtools installed and you have forked the rnoaa respository from Github.
For example, the FIPS for Miami-Dade is 12086. Based on the README file for this data, some of the windspeed variables have the abbreviations: “AWND”, “WSFG”, “WSF1”. You can therefore run:

library(devtools)
install_github("ropenscilabs/rnoaa", dependencies=TRUE) # if you need to install the package
## Skipping install for github remote, the SHA1 (608c7cde) has not changed since last install.
##   Use `force = TRUE` to force installation
install_github("leighseverson/countyweather") # if you need to install the package
## Skipping install for github remote, the SHA1 (3e04af3c) has not changed since last install.
##   Use `force = TRUE` to force installation
library(rnoaa)
library(countyweather)
miami_stations <- fips_stations("12086", date_min = "1992-08-01",
                                date_max = "1992-09-01")
miami_stations
## [1] "USC00083909" "USC00084095" "USC00087020" "USC00087760" "USC00088780"
## [6] "USR0000FTEN" "USW00012839" "USW00012859" "USW00092811"

This does not pull any weather stations. However, I know that the station “USW00012839” should exist for this time– it’s the Miami Intl Airport station. I wonder why this isn’t picked up by fips_stations?

Once you know the station number, you can run:

miami_wind <- meteo_pull_monitors("USW00012839", date_min = "1992-08-01",
                                     date_max = "1992-09-01", 
                                     var = c("AWND", "WSFG", "WSF1")) %>%
  mutate(wsfg = wsfg / 10) # Convert to m / s-- see the README for this data from link
head(miami_wind)
## Source: local data frame [6 x 5]
## 
##            id       date  awnd  wsf1  wsfg
##         (chr)     (date) (int) (int) (dbl)
## 1 USW00012839 1992-08-02    38    67   9.8
## 2 USW00012839 1992-08-03    27    94  12.9
## 3 USW00012839 1992-08-04    34    76   9.8
## 4 USW00012839 1992-08-05    33    54   7.7
## 5 USW00012839 1992-08-06    33    98  11.3
## 6 USW00012839 1992-08-07    48    80  10.3
ggplot(miami_wind, aes(x = date, y = wsfg)) + 
  geom_line() + ggtitle("Wind gust speeds near Miami, FL, \nduring Hurricane Andrew") + 
  ylab("Wind gust (m / s)") 

It seems like we should be able to get wind data from some of the other NOAA data sources. In particular, it seems like we should be able to get hourly, or another fine temporal resolution, of wind data. Check the “Data sources” section of the file at the bottom of the page here.

RS: The isd() function from the rnoaa package looks like it’s getting hourly data. Here’s an example with the station at Miami International Airport.

CP: A current issue is that the isd() function takes USAF and WBAN station codes as input, whereas we would like to use FIPS codes (county codes) as input. My initial thought for a way around this is to use the isd_stations_search from the rnoaa package, which takes lat/lon and radius (km) as input. I can use census data from here to find population weighted lat/lon associated with a given FIPS code, and then use the lat/lon as input to isd_stations_search to get the stations associated with that FIPS code. Note that we would like to produce output similar to weather_fips, a function that takes a county code as input and then gives weather data in the surrounding area. weather_fips, except for hourly data. The weather_fips code can be forked from here.

library(stringr)

census_data <- read.csv('http://www2.census.gov/geo/docs/reference/cenpop2010/county/CenPop2010_Mean_CO.txt')
state <- census_data$STATEFP
county <- census_data$COUNTYFP

state[str_length(state) == 1] <- paste0(0, state[str_length(state) == 1])
county[str_length(county) == 1] <- paste0(00, county[str_length(county) == 1])
county[str_length(county) == 2] <- paste0(0, county[str_length(county) == 2])

FIPS <- paste0(state,county)
census_data$FIPS <- FIPS

lat <- census_data$LATITUDE
lon <- census_data$LONGITUDE

#the idea is to go from FIPS -> population weighted lat/lon -> station ID (USAF and WBAN) -> hourly weather data

#this is the FIPS for Miami-Dade 
FIPS_test = "12086"
row_num = which(grepl(FIPS_test, census_data$FIPS))

lat_FIPS = lat[row_num]
lon_FIPS = lon[row_num]

#Get the stations within a 50 km radius corresponding to the 
#population weighted lat/lon of FIPS_test.
#It is also possible to specify a bounding box, instead of radius here.
stations <- isd_stations_search(lat=lat_FIPS, lon=lon_FIPS, radius = 50)
## Assuming 'lon' and 'lat' are longitude and latitude, respectively
#stations contains usaf and wban codes in the county given by FIPS_test.
USAF_codes = stations$usaf
WBAN_codes = stations$wban

#We can apply the isd() function to each of these stations to get the average hourly 'weather' data in the area. 

#here is just an example of what the data looks like for a single station (Miami International Airport)
res <- isd(usaf = USAF_codes[1], wban = WBAN_codes[1], year = 1992)$data
## <path>~/.rnoaa/isd/722020-12839-1992.gz
res$date_time <- ymd_hm(sprintf("%s %s", as.character(res$date), res$time))

res <- res %>% filter(temperature < 900) %>% select(usaf_station, wban_station, 
                                                    date_time, latitude, longitude, 
                                                    wind_direction, wind_speed, 
                                                    temperature)

ggplot(res, aes(date_time, temperature)) +
  geom_line() +
  facet_wrap(~usaf_station, scales = "free_x")

RS: We should be able to filter the res dataframe to the particular month we’re interested in to get hourly wind data. We found the USAF and WBAN codes for Miami by searching this text file of stations - there should be a better way to find USAF and WBAN codes for a particular location or station.

CP: I found a function to go from location (lat/lon) to USAF and WBAN codes isd_stations_search, and use census data to go from FIPS to lat/lon. See above code.

More info about this data can be found by going to NOAA’s land-based station data site, then following the link for Integrated Surface Hourly Data base (3505) at the bottom of the page. For example, the readme.txt and ish-format-document.pdf files could be helpful.

A measure of how strong winds were over water around where the storm made landfall

BA: For this, it seems like buoy data might work. The buoy function in rnoaa might work. I’m not sure how you could get station IDs through rnoaa for the buoy stations. I used the website to find a station close to Miami (“fwyf1”). Once you have a buoy id, you can do this kind of call:

ex <- buoy(dataset = 'cwind', buoyid = "fwyf1", year = 1992)
## Using c1992.nc
head(ex$data)
##                   time   lat   lon wind_dir wind_spd
## 1 1991-12-31T23:10:00Z 25.59 -80.1       19     10.3
## 2 1991-12-31T23:20:00Z 25.59 -80.1       19     10.3
## 3 1991-12-31T23:30:00Z 25.59 -80.1       12      9.3
## 4 1991-12-31T23:40:00Z 25.59 -80.1       12     10.1
## 5 1991-12-31T23:50:00Z 25.59 -80.1        6      9.9
## 6 1992-01-01T00:00:00Z 25.59 -80.1        7      9.3
fl_buoy <- ex$data %>% mutate(time = ymd_hms(time)) %>%
  filter(time > ymd_hms("1992-08-23 00:00:00") & 
           time < ymd_hms("1992-08-26 00:00:00"))
ggplot(fl_buoy, aes(x = time, y = wind_spd)) + geom_line()

It looks like this buoy was actually lost during the hurricane. It stops recording during the worst of the storm, and it doesn’t look like it got back online in 1992 after that.

It looks like you need to look through the historical station maps to find a station, rather than the current ones. If there’s somewhere we could get a list of all buoy ids by latitude and longitude, we could identify the ones closest to a location.

Wind directions at different locations (on land or over water) near landfall

An estimate of how many of the stations near the landfall that were operating the week before the storm that were still operational and recording data at landfall

Precipitation at affected cities

For each of the affected cities, estimate the precipitation during the storm and on neighboring days. Include:

  • Daily and hourly estimates of rainfall
  • How many stations you used to get each of those values
  • A map of stations you used to get those values, with some measure on the map of the maximum daily or hourly rainfall measured at that station

Daily and hourly estimates of rainfall

RS: The countyweather package has some functions for pulling and aggreagating weather data by FIPS code. For the US storms, this might be helpful for this task. If you have devtools installed, you can install countyweather using the following code:

# library(devtools)
# install_github("ropenscilabs/rnoaa") # if you need to install the package
# install_github("leighseverson/countyweather") # if you need to install the package
library(rnoaa)
library(countyweather)
miami_stations <- fips_stations(fips = "12086")
miami_rain <- meteo_pull_monitors(miami_stations, var = "PRCP")
range(miami_rain$date)

BG: For some reason, NCDC seems to only be identifying Miami weather monitors that operated from 2007 on.

You can instead use the meteo_nearby_stations function to find all precipitation monitors in a certain radius (parts of this take a long time to run, so I won’t evaluate):

station_data <- ghcnd_stations()[[1]] # Takes a while to run
miami <- data.frame(id = "miami", latitude = 25.7617, longitude = -80.1918)
miami_monitors <- meteo_nearby_stations(lat_lon_df = miami,
                                        station_data = station_data,
                                        radius = 50,
                                        var = c("PRCP"), 
                                        year_min = 1992, year_max = 1992)
miami_monitors <- miami_monitors[[1]]$id
miami_weather <- meteo_pull_monitors(miami_monitors, date_min = "1992-08-01",
                                     date_max = "1992-09-01", 
                                     var = c("PRCP")) 
ggplot(miami_weather, aes(x = date, y = prcp, color = id)) + 
  geom_line() + ggtitle("Precipitation near Miami, FL, \nduring Hurricane Andrew") + 
  ylab("Precipitation (mm)")

##Cyclone Tracy
library(ggmap)
darwin_ll <- geocode("Darwin, Australia")
darwin <- data.frame(id = "darwin", 
                     latitude = darwin_ll$lat, 
                     longitude = darwin_ll$lon)
darwin_monitors <- meteo_nearby_stations(lat_lon_df = darwin,
                                         lat_colname = "latitude",
                                         lon_colname = "longitude",
                                         station_data = station_data,
                                         radius = 50,
                                         var = "PRCP",
                                         year_min = 1974, year_max = 1974)
darwin_monitors <- darwin_monitors[[1]]$id

-JF: darwin_monitors includes 25 possible monitors for the year 1974. However, when using the function meteo_pull_monitors, data was pulled for only 10 of these possible 25 monitors. This may be due to the date restriction used to pull data for this storm, and 15 monitors simply do not have available data from this time frame. No error message was given.

ggplot(darwin_weather, aes(x = date, y = prcp, color = id)) +
  geom_line() + ggtitle("Precipitation near Darwin, Australia, \nduring Cyclone Tracy") +
  ylab("Precipitation (mm)")
## Warning: Removed 23 rows containing missing values (geom_path).

##Labor Day Storm
keys <- geocode("Florida Keys, St. Petersburg, FL")
keys <- data.frame(id = "keys", latitude = keys$lat, longitude = keys$lon)
keys_monitors <- meteo_nearby_stations(lat_lon_df = keys,
                                         lat_colname = "latitude",
                                         lon_colname = "longitude",
                                         station_data = station_data,
                                         radius = 50,
                                         var = "PRCP",
                                         year_min = 1935, year_max = 1935)
keys_monitors <- keys_monitors[[1]]$id
keys_weather <- meteo_pull_monitors(keys_monitors, date_min = "1935-08-15",
                                      date_max = "1935-09-15",
                                      var = "PRCP")
ggplot(keys_weather, aes(x = date, y = prcp, color = id)) +
  geom_line() + ggtitle("Precipitation near the Florida Keys, Fl, \nduring the Labor Day Storm") +
  ylab("Precipitation (mm)")

fuzhou <- geocode("Fuzhou, China")
fuzhou <- data.frame(id = "fuzhou", latitude = fuzhou$lat, longitude = fuzhou$lon)
fuzhou_monitors <- meteo_nearby_stations(lat_lon_df = fuzhou,
                                         lat_colname = "latitude",
                                         lon_colname = "longitude",
                                         station_data = station_data,
                                         radius = 50,
                                         var = "PRCP",
                                         year_min = 2006, year_max = 2006)
fuzhou_monitors <- fuzhou_monitors[[1]]$id
  • JF: Looks like there is only one station in/near Fuzhou, China, within a 50 mile radius.
fuzhou_weather <- meteo_pull_monitors(fuzhou_monitors, date_min = "2006-07-01",
                                      date_max = "2006-08-01",
                                      var = "PRCP")
ggplot(fuzhou_weather, aes(x = date, y = prcp, color = id)) +
  geom_line() + ggtitle("Precipitation near the Fuzhou, China, \nduring Tropical Storm Bills") +
  ylab("Precipitation (mm)")

How many stations you used to get each of those values

A map of stations you used to get those values, with some measure on the map of the maximum daily or hourly rainfall measured at that station

Plotting the storm

Plot the full track of the storm. Show its intensity as it progressed along the track, as well as the affected cities.